Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Datasources - Part 1: Rework datasource classes #3101

Merged
merged 9 commits into from
Jul 19, 2024

Conversation

rolnico
Copy link
Member

@rolnico rolnico commented Jul 10, 2024

Please check if the PR fulfills these requirements

  • The commit message follows our guidelines
  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)

Does this PR already have an issue describing the problem?
No

What kind of change does this PR introduce?
Feature

What is the current behavior?
All the existing DataSources based on files extend FileDataSources. The difference between datasources considering files in a directory and the datasources considering files in an archive is not clear enough.

What is the new behavior (if this is a feature change)?
Two types of DataSources based on files now exists:

  • DirectoryDataSource (inherited by GzDirectoryDataSource, Bzip2DirectoryDataSource, ZstdDirectoryDataSource, XZDirectoryDataSource): considers files in a directory
  • AbstractArchiveDataSource (only ZipArchiveDataSource for now but a TarArchiveDataSource could be added later): considers files in an archive.

Does this PR introduce a breaking change or deprecate an API?

  • Yes
  • No

If yes, please check if the following requirements are fulfilled

  • The Breaking Change or Deprecated label has been added
  • The migration steps are described in the following section

What changes might users need to make in their application due to this PR? (migration steps)
Two types of DataSources based on files now exists:

  • DirectoryDataSource (inherited by GzDirectoryDataSource, Bzip2DirectoryDataSource, ZstdDirectoryDataSource, XZDirectoryDataSource): considers files in a directory
  • AbstractArchiveDataSource (only ZipArchiveDataSource for now but a TarArchiveDataSource could be added later): considers files in an archive.

Therefore, multiple classes have changed, please update your code if you used them directly:

Old classes New classes
Bzip2FileDataSource Bzip2DirectoryDataSource
FileDataSource DirectoryDataSource
GzFileDataSource GzDirectoryDataSource
XZFileDataSource XZDirectoryDataSource
ZipFileDataSource ZipArchiveDataSource
ZstdFileDataSource ZstdDirectoryDataSource

Note: Their constructors are not the same so you will have to adapt your code to the new ones, depending on which DataSource you use.

Other information:

Signed-off-by: Nicolas Rol <nicolas.rol@rte-france.com>
@rolnico rolnico added the Breaking Change API is broken label Jul 10, 2024
@rolnico rolnico self-assigned this Jul 10, 2024
@rolnico rolnico changed the title Rework datasource classes Datasources - Part 1: Rework datasource classes Jul 10, 2024
@rolnico rolnico requested review from jonenst and flo-dup July 10, 2024 15:51
@rolnico
Copy link
Member Author

rolnico commented Jul 10, 2024

This is the first part of the division of the PR #2959

Signed-off-by: Nicolas Rol <nicolas.rol@rte-france.com>
@Override
public String getBaseName() {
return baseName;
DirectoryDataSource(Path directory, String baseName,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add also

DirectoryDataSource(Path directory, String baseName, CompressionFormat compressionFormat) 

?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it wouldn't be used so I'm not sure we should add it.

* @return true if the file exists, else false
*/
@Override
public boolean exists(String suffix, String ext) throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move to AbstractFileSystemDatasource and remove the same method from DirectoryDataSource?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved

…ataSource

Signed-off-by: Nicolas Rol <nicolas.rol@rte-france.com>

UnsupportedOperationException exception = assertThrows(UnsupportedOperationException.class, () -> {
try (OutputStream ignored = dataSource.newOutputStream("foo.bar", true)) {
fail();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to fail explicitly ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not needed explicitly but it need something in the try() that cannot throw a runtime exception


static Stream<Arguments> provideArgumentsForWriteThenReadTest() {
return Stream.of(
Arguments.of("foo", "iidm", CompressionFormat.ZIP),
Copy link
Contributor

@jonenst jonenst Jul 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should use the same semantics for the "mainExtension" as other test classes

        Arguments.of("foo", "iidm", CompressionFormat.ZIP),

vs

        Arguments.of("foo", ".iidm", CompressionFormat.ZSTD),

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(or just remove the mainExtension concept from this PR entirely, it's just used for tests and you can put everything in one string)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

rolnico added 2 commits July 15, 2024 16:31
Signed-off-by: Nicolas Rol <nicolas.rol@rte-france.com>
Signed-off-by: Nicolas Rol <nicolas.rol@rte-france.com>
static Stream<Arguments> provideArgumentsForClassAndListingTest() {
return Stream.of(
Arguments.of("foo.iidm", CompressionFormat.BZIP2, Bzip2DirectoryDataSource.class,
Set.of("foo", "foo.txt", "foo.iidm", "foo.xiidm", "foo.v3.iidm", "foo.v3", "foo_bar.iidm", "foo_bar",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe factorize the sets as they are the same? which would highlight the fact that so far the listed files are the same for the 3 parameterized tests.
Same comment for other test classes.

protected Path testDir;
protected Set<String> unlistedFiles;
protected Set<String> existingFiles;
protected ArchiveFormat archiveFormat;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this parameter should rather be in AbstractArchiveDataSourceTest


// Create an archive when needed
if (archiveFormat != null) {
createArchiveAndFiles(fileName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should rather be in the setup of AbstractArchiveDataSourceTest

rolnico added 3 commits July 19, 2024 14:39
Signed-off-by: Nicolas Rol <nicolas.rol@rte-france.com>
Signed-off-by: Nicolas Rol <nicolas.rol@rte-france.com>
Signed-off-by: Nicolas Rol <nicolas.rol@rte-france.com>
Copy link

@flo-dup flo-dup merged commit 98da936 into main Jul 19, 2024
7 checks passed
@flo-dup flo-dup deleted the nro/datasources_reorganisation branch July 19, 2024 13:55
@flo-dup flo-dup mentioned this pull request Aug 9, 2024
7 tasks
geofjamg pushed a commit that referenced this pull request Sep 18, 2024
* reorganize datasource classes
* add tests

Signed-off-by: Nicolas Rol <nicolas.rol@rte-france.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Breaking Change API is broken
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants